Automatic speaker identification using the magnitude and phase spectra of inverse-filtered voiced speech
Abstract
A speaker identification experiment was performed with the 10 digits of the TI46 data corpus, spoken in isolation by 8 female and 8 male speakers. From each voiced frame of an utterance, a single fundamental period was extracted automatically and inverse-filtered to obtain an approximation of the glottal excitation function. Differently sized feature vectors were derived from the magnitude and phase spectra of the glottal excitation function. They were considered by themselves and in combination with the fundamental frequency. The best identification results from the excitation spectra alone were 28% for the 8 female speakers and 34% for the 8 male speakers. When the feature vectors were extended to also contain the fundamental frequency of the frame, the identification results for the 8 female speakers increased to 47% and for the 8 male speakers to 37%.
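The feature extraction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the inverse filter was estimated, so autocorrelation-method LPC is assumed here, and the function names, the LPC order of 12, the choice of 8 spectral bins, and cutting the period from the middle of the frame are all hypothetical.

```python
import numpy as np


def lpc_coefficients(x, order):
    """Autocorrelation-method LPC (Levinson-Durbin). Returns a with a[0] == 1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def excitation_features(frame, period_len, order=12, n_bins=8):
    """Magnitude and phase of one inverse-filtered period (assumed procedure)."""
    frame = np.asarray(frame, dtype=float)
    if period_len > len(frame):
        raise ValueError("period_len must not exceed the frame length")
    # Estimate the vocal-tract filter on the windowed frame (assumption).
    a = lpc_coefficients(frame * np.hamming(len(frame)), order)
    # Inverse filtering: e[n] = sum_j a[j] * x[n - j].
    residual = np.convolve(frame, a)[: len(frame)]
    # Cut a single period from the middle of the frame (hypothetical choice).
    start = (len(frame) - period_len) // 2
    period = residual[start : start + period_len]
    spectrum = np.fft.rfft(period)
    bins = spectrum[1 : n_bins + 1]  # skip the DC bin
    return np.abs(bins), np.angle(bins)


def feature_vector(frame, period_len, sample_rate, order=12, n_bins=8):
    """Spectral features extended with the frame's fundamental frequency."""
    mag, phase = excitation_features(frame, period_len, order, n_bins)
    f0 = sample_rate / period_len
    return np.concatenate([mag, phase, [f0]])
```

Varying `n_bins` gives feature vectors of different sizes, and dropping the final element corresponds to using the excitation spectra alone.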
Related resources
Tailoring kalman filtering towards speaker characterisation
This paper describes a method for obtaining smoothed vocal tract parameters from analysis during the closed phase of the glottis. The method is based upon Expectation Maximisation (EM) and uses Kalman-Rauch forward-backward iterations through a voiced segment, in which the speech data during excitation and open phases are excluded by treating them as ‘missing data’. This approach exploits the n...
Usable Speech Assignment for Speaker Identification System
Usable speech criteria are proposed to extract minimally corrupted speech for speaker identification in cochannel speech. Extracted usable segments are separated in time and need to be organized into speaker streams for a speaker identification system. In this paper, we focus on organizing extracted usable speech segments into a single stream for the same speaker using a speaker assignment system. We ext...
Voice Source State as a Source of Information in Speech Recognition: Detection of Laryngealizations
Laryngealizations are irregular voiced portions of speech, which can have morphosyntactic functions and can disturb the automatic computation of F0. Two methods for the automatic detection of laryngealizations are described in this paper: With a Gaussian classifier using spectral and cepstral features a recognition rate of 80% (false alarm rate of 8%) could be achieved. As an alternative a “non...
The effectiveness of higher order spectral phase features in speaker identification
This paper studies the effectiveness of higher order spectra (HOS) phase features in the task of speaker identification. Within the speech processing community, short time spectral phase information is generally regarded as unimportant for speaker recognition. In fact, the most commonly used features for speaker recognition are the Mel frequency cepstral coefficients (MFCC), which are defined f...
Higher Order Spectral Phase Features for Speaker Identification
This paper investigates the use of higher order spectra (HOS) phase features in the task of speaker identification. Within the speech processing community, short time spectral phase information is widely regarded as unimportant for speaker recognition. Features are generally defined from the magnitude spectrum only. This paper utilises features that contain both magnitude and phase spectral inf...
Publication date: 2012